Finding trends and statistical patterns in name mentions in news

Authors

  • Abigail Mae C. Jayin
  • Rene C. Batac
Abstract

We extract the individual names of persons mentioned in news reports from a Philippine-based daily in the English language from 2010-2012. Names are extracted using a learning algorithm that filters adjacent capitalized words and runs them through a database of non-names grown through training. The number of mentions of individual names shows strong temporal fluctuations, indicative of the nature of “hot” trends and issues in society. Despite these strong variations, however, we observe stable rank-frequency distributions across different years in the form of power-laws with scaling exponents α = 0.7, reminiscent of Zipf’s law observed in lexical (i.e. non-name) words. Additionally, we observe that the adjusted frequency for each rank, or the frequency divided by the number of unique names having the same rank, shows a distribution with dual scaling behavior, with the higher-ranked names preserving the α exponent and the lower-ranked ones showing a power-law exponent α′ = 2.9. We reproduced the results using a model wherein the names are taken from a Barabasi-Albert network representing the social structure of the system. These results suggest that names, which represent individuals in the society, are archived differently from regular words.
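The pipeline described in the abstract (filter adjacent capitalized words, screen them against a non-name list, then build rank-frequency and adjusted-frequency tables) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The seed list `NON_NAMES`, the regular expression, and the function names `extract_names`, `rank_frequency`, and `loglog_slope` are all hypothetical. The sketch also assumes that names with equal mention counts share one rank, and it estimates the scaling exponent with a plain least-squares fit in log-log space, which may differ from the fitting procedure used in the paper. For a power law f ∝ r^(-α), the fitted slope is -α.

```python
import math
import re
from collections import Counter

# Hypothetical seed list of capitalized non-name words. The abstract describes
# a database grown through training; that training step is not reproduced here.
NON_NAMES = {"President", "Senator", "Monday", "The"}

# Two or more adjacent capitalized words, e.g. "Benigno Aquino".
CAP_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def extract_names(text, non_names=NON_NAMES):
    """Return candidate person names found in text, in order of appearance."""
    names = []
    for match in CAP_RUN.finditer(text):
        # Drop words on the non-name list; keep the run if two or more remain.
        words = [w for w in match.group().split() if w not in non_names]
        if len(words) >= 2:
            names.append(" ".join(words))
    return names


def rank_frequency(names):
    """Return (rank, frequency, adjusted frequency) tuples, highest frequency first.

    Assumption: names with the same mention count share one rank, and the
    adjusted frequency is that count divided by the number of such names.
    """
    counts = Counter(names)
    names_per_freq = Counter(counts.values())
    freqs = sorted(names_per_freq, reverse=True)
    return [(r, f, f / names_per_freq[f]) for r, f in enumerate(freqs, start=1)]


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x); needs two or more distinct xs."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx


text = ("President Benigno Aquino met Senator Miriam Santiago on Monday. "
        "Benigno Aquino spoke later.")
names = extract_names(text)
table = rank_frequency(names)
ranks = [row[0] for row in table]
freqs = [row[1] for row in table]
slope = loglog_slope(ranks, freqs)  # toy data only; not a meaningful exponent
```

On a real corpus, the dual scaling reported in the abstract would call for fitting the adjusted-frequency column separately over the high-rank and low-rank regions. Where to place the crossover between the two regions is not stated in the abstract and is left open here.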

Similar resources

A Study of Readers' Demographic and Behavioral Patterns for the Selective Dissemination of News

Purpose: The current research focuses on the selective dissemination of news and aims at finding patterns for recognizing readers’ favorite news through a web mining technique. Method: Data for this research was collected from the Yahoo News Website. The source of news was Associated Press. 840 news items dated between 2011/3/1 and 2011/5/10 were analyzed through a subject clustering technique. Findings:...

Finding Potential News from Trends Originating in the Blogosphere

Tracking current population interests by trends in online media of entities and topics has become increasingly popular. But while notable world events often spur online public discussion, some have been observed originating in social media postings. A natural question arises: Can analysis of social media trends be used to find mainstream newsworthy material? The work reported here takes initial...

Thematic Progression Patterns in the English News and the Persian Translation

Thematic progression pattern, as the method of development of the text, ensures that the reader follows the right path in understanding the text; in this regard, this subject is attracting considerable interest among discourse analysts. This paper calls into question the status of thematic progression in the process of translating English news into Persian. With this in mind, we analyzed the them...

A Probabilistic Model for Canonicalizing Named Entity Mentions

We present a statistical model for canonicalizing named entity mentions into a table whose rows represent entities and whose columns are attributes (or parts of attributes). The model is novel in that it incorporates entity context, surface features, first-order dependencies among attribute-parts, and a notion of noise. Transductive learning from a few seeds and a collection of mention tokens co...

Frame Labeling of Competing Narratives in Journalistic Translation

Studying translations during the time of conflict has gained currency in the recent decade in translation studies. One of the cases in which conflict manifests itself is in the way different countries choose to name an event or a geographical location, for example. This study set out to understand how translation of rival names and labeling was carried out in Iranian state-run news agencies. To...

Journal title:
  • CoRR

Volume abs/1507.02449  Issue

Pages  -

Publication date 2015